Skip to content

Conversation

camilamacedo86
Copy link
Contributor

@camilamacedo86 camilamacedo86 commented Nov 17, 2024

This commit removes the use of the kube-rbac-proxy image and replaces it with metrics authentication/authorization provided by controller-runtime. The kube-rbac-proxy image is deprecated and will no longer be maintained, which introduces risks to production environments. For more details, see: kubernetes-sigs/kubebuilder#3907

PR for catalogd: operator-framework/catalogd#460
Motivation: #1509

Tests done locally

To ensure backwards compatibility

Create the clusterrolebinding

$ kubectl create clusterrolebinding operator-controller-metrics-binding \
   --clusterrole=operator-controller-metrics-reader \
   --serviceaccount=olmv1-system:operator-controller-controller-manager

Create the TOKEN

TOKEN=$(kubectl create token operator-controller-controller-manager -n olmv1-system)
echo $TOKEN

Use curl to validate

kubectl label namespace olmv1-system pod-security.kubernetes.io/enforce-
kubectl run curl-metrics --rm -it --restart=Never \
  --image=curlimages/curl:7.87.0 -n olmv1-system -- /bin/sh

Call the metrics:

 curl -v -k -H "Authorization: Bearer $TOKEN" \
> https://operator-controller-controller-manager-metrics-service.olmv1-system.sv
c.cluster.local:8443/metrics
*   Trying 10.96.16.165:8443...
* Connected to operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local (10.96.16.165) port 8443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Client hello (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Server hello (2):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Certificate (11):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, CERT verify (15):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Finished (20):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: [NONE]
*  start date: Dec 12 20:40:15 2024 GMT
*  expire date: Mar 12 20:40:15 2025 GMT
*  issuer: CN=olmv1-ca
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multiplexing
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* h2h3 [:method: GET]
* h2h3 [:path: /metrics]
* h2h3 [:scheme: https]
* h2h3 [:authority: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443]
* h2h3 [user-agent: curl/7.87.0-DEV]
* h2h3 [accept: */*]
* h2h3 [authorization: Bearer TOKEN
* Using Stream ID: 1 (easy handle 0xffff82fa3aa0)
> GET /metrics HTTP/2
> Host: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443
> user-agent: curl/7.87.0-DEV
> accept: */*
> authorization: Bearer TOKEN 
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 200 
< content-type: text/plain; version=0.0.4; charset=utf-8; escaping=values
< date: Thu, 12 Dec 2024 20:43:10 GMT
< 
# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter
certwatcher_read_certificate_errors_total 0
# HELP certwatcher_read_certificate_total Total number of certificate reads
# TYPE certwatcher_read_certificate_total counter
certwatcher_read_certificate_total 1
# HELP controller_runtime_active_workers Number of currently used workers per controller
# TYPE controller_runtime_active_workers gauge
....
workqueue_work_duration_seconds_count{controller="clusterextension",name="clusterextension"} 0
* Connection #0 to host operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local left intact

Now, to validate the call with the certificates

Create the POD with the secret

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: curl-metrics
  namespace: olmv1-system
spec:
  serviceAccountName: operator-controller-controller-manager
  containers:
  - name: curl
    image: curlimages/curl:7.87.0
    command:
    - sh
    - -c
    - sleep 3600
    volumeMounts:
    - mountPath: /tmp/cert
      name: olm-cert
      readOnly: true
  volumes:
  - name: olm-cert
    secret:
      secretName: olmv1-cert
  restartPolicy: Never
EOF
$ kubectl exec -it curl-metrics -n olmv1-system -- sh

$ curl -v --cacert /tmp/cert/ca.crt --cert /tmp/cert/tls.crt --key /tmp/cert/t
ls.key \
> -H "Authorization: Bearer $TOKEN" \
> https://operator-controller-controller-manager-metrics-service.olmv1-system.sv
c.cluster.local:8443/metrics
*   Trying 10.96.16.165:8443...
* Connected to operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local (10.96.16.165) port 8443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /tmp/cert/ca.crt
*  CApath: none
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Client hello (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Server hello (2):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Certificate (11):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, CERT verify (15):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Finished (20):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: [NONE]
*  start date: Dec 12 20:40:15 2024 GMT
*  expire date: Mar 12 20:40:15 2025 GMT
*  subjectAltName: host "operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local" matched cert's "operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local"
*  issuer: CN=olmv1-ca
*  SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* h2h3 [:method: GET]
* h2h3 [:path: /metrics]
* h2h3 [:scheme: https]
* h2h3 [:authority: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443]
* h2h3 [user-agent: curl/7.87.0-DEV]
* h2h3 [accept: */*]
* h2h3 [authorization: Bearer TOKEN
* Using Stream ID: 1 (easy handle 0xffffb0cf9aa0)
> GET /metrics HTTP/2
> Host: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443
> user-agent: curl/7.87.0-DEV
> accept: */*
> authorization: Bearer TOKEN
> 
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 200 
< content-type: text/plain; version=0.0.4; charset=utf-8; escaping=values
< date: Thu, 12 Dec 2024 20:46:29 GMT
< 

@camilamacedo86 camilamacedo86 requested a review from a team as a code owner November 17, 2024 10:19
Copy link

netlify bot commented Nov 17, 2024

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit fccc728
🔍 Latest deploy log https://app.netlify.com/sites/olmv1/deploys/6764843156b32600086cd7c6
😎 Deploy Preview https://deploy-preview-1475--olmv1.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 3 times, most recently from 4349f0e to 48dc64a Compare November 17, 2024 10:22
Copy link

codecov bot commented Nov 17, 2024

Codecov Report

Attention: Patch coverage is 51.66667% with 29 lines in your changes missing coverage. Please review.

Project coverage is 74.13%. Comparing base (b67bd38) to head (fccc728).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
cmd/manager/main.go 51.66% 24 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1475      +/-   ##
==========================================
- Coverage   74.65%   74.13%   -0.52%     
==========================================
  Files          42       42              
  Lines        3271     3329      +58     
==========================================
+ Hits         2442     2468      +26     
- Misses        653      679      +26     
- Partials      176      182       +6     
Flag Coverage Δ
e2e 52.11% <51.66%> (-0.04%) ⬇️
unit 56.86% <0.00%> (-1.11%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 48dc64a to 3487ab3 Compare November 18, 2024 12:02
@joelanford
Copy link
Member

/hold

This doesn't seem like a change that should be made after we've released 1.0.0-rc1. Let's wait until after we cut 1.0.0.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 18, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 3487ab3 to a7ace73 Compare November 18, 2024 13:35
@camilamacedo86

This comment was marked as resolved.

@joelanford
Copy link
Member

Let's prioritize a design for this. I do not want to add any risk to our 1.0.0 in the final week after we've already released an RC.

@camilamacedo86
Copy link
Contributor Author

/hold

@camilamacedo86 camilamacedo86 changed the title ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization (WIP) - POC - RFC ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization Nov 19, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 19, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 938fb19 to 7e22db8 Compare November 19, 2024 19:06
@camilamacedo86 camilamacedo86 changed the title (WIP) - POC - RFC ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization (WIP) - ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization Nov 19, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 4 times, most recently from aaf8ce9 to e3b9df9 Compare November 19, 2024 19:38
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 5 times, most recently from a5d0cbc to 1488a45 Compare December 19, 2024 17:57
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 1488a45 to 423bf5d Compare December 19, 2024 18:02
Copy link
Member

@LalatenduMohanty LalatenduMohanty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 19, 2024
@LalatenduMohanty
Copy link
Member

/hold for @trgeiger to take a look as well

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 19, 2024
…ontroller-runtime feature

Utilise Controller-Runtime's WithAuthenticationAndAuthorization feature to protect the metrics endpoint. This approach provides access control, similar to the functionality of kube-rbac-proxy. kube-rbac-proxy image from gcr.io/kubebuilder/kube-rbac-proxy is deprecated and should no longer be used

More info: kubernetes-sigs/kubebuilder#3907
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 423bf5d to fccc728 Compare December 19, 2024 20:38
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 19, 2024
Copy link
Contributor

@trgeiger trgeiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@camilamacedo86
Copy link
Contributor Author

/hold false

@LalatenduMohanty
Copy link
Member

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 19, 2024
Copy link
Member

@LalatenduMohanty LalatenduMohanty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@camilamacedo86 camilamacedo86 added this pull request to the merge queue Dec 19, 2024
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 19, 2024
Merged via the queue into operator-framework:main with commit 10f0f77 Dec 19, 2024
17 of 19 checks passed
@camilamacedo86 camilamacedo86 deleted the replace-rbac-protect-metrics branch December 19, 2024 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants